Statistical Disclosure Control Methods for Census Frequency Tables
Abstract
This paper reviews common statistical disclosure control (SDC) methods that statistical agencies apply to standard tabular outputs containing whole-population counts from a census (either enumerated or register-based). These methods include record swapping on the microdata before tabulation and rounding of table entries after the tables are produced. SDC methods are assessed within a disclosure risk–data utility framework, which seeks a balance between managing disclosure risk and maximizing the amount of information released to users while ensuring high-quality outputs. To carry out the analysis, quantitative measures of disclosure risk and data utility are defined and the methods are compared. The analysis shows that record swapping as the sole SDC method leaves a high disclosure risk. Targeted record swapping lowers the disclosure risk but distorts distributions more. Small cell adjustments (rounding) protect Census tables by eliminating small cells, but only one set of variables and geographies can be disseminated if disclosure by differencing nested tables is to be avoided. Full random rounding offers more protection against disclosure by differencing, but margins are typically rounded separately from the internal cells, so the tables are not additive. Compared with record swapping, rounding procedures also protect against the perception of disclosure risk, since no small cells appear in the tables. Combining rounding with record swapping raises the level of protection but increases the loss of utility in Census tabular outputs. For some statistical analyses, the combination of record swapping and rounding balances, to some degree, the opposing effects that the two methods have on the utility of the tables.
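To make the rounding idea concrete, here is a minimal Python sketch of unbiased random rounding applied to a small frequency table. It is an illustration under stated assumptions, not the procedure used in the paper: the rounding base of 3, the function names `random_round` and `round_table`, and the example counts are all hypothetical. Row margins are rounded independently of the internal cells, which is why a rounded table need not be additive.

```python
import random


def random_round(count, base=3, rng=random):
    """Unbiased random rounding of a non-negative count to a multiple of base.

    Assumption: a count with remainder r is rounded up with probability
    r / base and down otherwise, so its expected value is unchanged.
    """
    remainder = count % base
    if remainder == 0:
        return count  # already a multiple of the base
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def round_table(table, base=3, seed=0):
    """Round internal cells and row margins separately (hypothetical helper)."""
    rng = random.Random(seed)
    cells = [[random_round(c, base, rng) for c in row] for row in table]
    # Margins are computed from the true counts and rounded on their own,
    # so they may differ from the sum of the rounded cells.
    row_margins = [random_round(sum(row), base, rng) for row in table]
    return cells, row_margins


if __name__ == "__main__":
    original = [[1, 2, 4], [0, 3, 7]]  # made-up counts for illustration
    cells, margins = round_table(original, base=3, seed=1)
    for row, margin in zip(cells, margins):
        print(row, "published margin:", margin, "sum of rounded cells:", sum(row))
```

Every published value is a multiple of the base, so no counts of 1 or 2 appear, but comparing each published margin with the sum of its rounded cells shows where additivity can be lost.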
Related resources
WP. 33 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
In order to manage the disclosure risk in frequency tables containing population counts, the tables undergo statistical disclosure control (SDC) methods. This results in information loss. We examine quantitative information loss measures for frequency tables and compare them across different SDC methods. We show examples of the information loss measures on real UK 2001 Census tables after they ...
Statistical Disclosure Control: New Directions and Challenges
Traditionally, statistical agencies release outputs in the form of microdata and tabular data. Microdata contain data from social surveys, and tabular data contain either frequency counts, such as for census dissemination, or magnitude data typically arising from business surveys, e.g., total revenue. For each of these traditional outputs, there has been much research on how to quantify ...
Measuring Disclosure Risk and Information Loss in Population Based Frequency Tables
Frequency tables disseminated by statistical agencies have always been of high interest. However, the agencies have to ensure that the risk of identifying individuals and disclosing individuals’ attributes from the released data is low. Therefore they assess the risk of disclosure and apply statistical disclosure control (SDC) methods if necessary. The main objective of this work is to measure ...
Geographically intelligent disclosure control for flexible aggregation of census data
This paper describes a geographically intelligent approach to disclosure control for protecting flexibly aggregated census data. Increased analytical power has stimulated user demand for more detailed information for smaller geographical areas and customized boundaries. Consequently it is vital that improved methods of statistical disclosure control are developed to protect against the increase...
Not for Citation or Quotation Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies
Even in the age of electronic dissemination of statistical data, tables are central data products of statistical agencies. For prominent examples, see the American FactFinder (http://factfinder.census.gov/servlet/BasicFactsServlet) from the U.S. Bureau of Census, the Office of National Statistics (http://www.statistics.gov.uk/) in the U.K., and Statistics Netherlands (http://www.cbs.nl/en/figur...